Nucleotide and peptide sequences of a hepatitis C virus isolate, diagnostic and therapeutic applications

ABSTRACT

This invention relates to oligonucleotides encoding HCV E1 peptides, labeled oligonucleotide probes, recombinant DNA molecules comprising HCV E1 nucleotides, plasmids, expression vectors, transformed hosts, analytical kits for detecting nucleotide sequences of hepatitis C virus, and process for preparing polypeptides.

The present invention relates to nucleotide and peptide sequences of a European, more particularly French, strain of the hepatitis C virus, as well as to the diagnostic and therapeutic applications of these sequences.

The hepatitis C virus is a major causative agent of infections by viruses previously called "Non-A Non-B" viruses. Infections by the C virus in fact now represent the most frequent forms of acute hepatitides and chronic Non-A Non-B hepatitides (Alter et al. (1); Choo et al. (3); Hopf et al. (5); Kuo et al. (8); Miyamura et al. (11)). Furthermore, there is a relationship (the significance of which is still poorly understood) between the presence of anti-HCV antibodies and the development of primary liver cancers. It has also been shown that the hepatitis C virus is involved in both chronic and acute Non-A Non-B hepatitides, whether linked to transfusions of blood products or of sporadic origin.

The genome of the hepatitis C virus has been cloned and the nucleotide sequence of an American isolate has been described in EP-A-0 318 216, EP-A-0 363 025, EP-A-0 388 232 and WO-A-90/14436. Moreover, data are currently available on the nucleotide sequences of several Japanese isolates relating both to the structural region and the nonstructural region of the virus (Okamoto et al. (12); Enomoto et al. (4); Kato et al. (6); Takeuchi et al. (15 and 16)). The virus exhibits some similarities with the group comprising Flavi- and Pestiviruses; however, it appears to form a distinct class, different from viruses known up until now (Miller and Purcell (10)).

In spite of the breakthrough which the cloning of HCV represented, several problems persist:

a substantial genetic variability exists in certain regions of the virus, which has made it possible to describe the existence of two groups of viruses,

diagnosis of the viral infection remains difficult in spite of the possibility of detecting anti-HCV antibodies in the serum of patients. This is due to the existence of false positive results and to a delayed seroconversion following acute infection. Finally there are clearly cases where only the detection of the virus RNA makes it possible to detect the HCV infection while the serology remains negative.

These problems have important implications both with respect to diagnosis and protection against the virus.

The authors of the present invention have carried out the cloning and obtained the partial nucleotide sequence of a French isolate of HCV (called hereinafter HCV E1) from a blood donor who transmitted an active chronic hepatitis to a recipient. Comparison of the nucleotide sequences and the peptide sequences obtained with the respective sequences of the American and Japanese isolates showed that there was:

a high conservation of nucleic acids in the noncoding region of HCV E1,

a high genetic variability in the structural regions called E1 and E2/NS1,

a smaller genetic variability in the nonstructural region.

The present invention is based on new nucleotide and polypeptide sequences of the hepatitis C virus which have not been described in the abovementioned state of the art.

The subject of the present invention is thus a DNA sequence of HCV E1 comprising a DNA sequence chosen from the nucleotide sequences of at least 10 nucleotides between the following nucleotides (n): n₁₁₈ to n₁₃₈; n₁₇₇ to n₂₀₂; n₂₃₃ to n₂₄₇; n₂₅₄ to n₂₇₂ and n₂₇₂ to n₂₈₈ represented in the sequence SEQ ID NO:2, and n₁₅₆ to n₁₇₀; n₁₇₀ to n₂₁₇; n₂₆₇ to n₂₆₃ and n₃₁₀ to n₃₃₄ represented in the sequence SEQ ID NO:4; as well as analogous nucleotide sequences resulting from degeneracy of the genetic code.
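By way of illustration, the fragments covered by a definition of this kind (any run of at least 10 nucleotides lying within a stated range) can be enumerated programmatically. The following Python sketch is not part of the original disclosure: the function name `fragments` and the toy sequence are hypothetical, and positions are assumed to be 1-based and inclusive, as in the sequence listing.

```python
def fragments(seq, start, end, min_len=10):
    """Yield (first, last, subsequence) for every contiguous fragment of at
    least min_len residues lying within positions start..end of seq.
    Positions are 1-based and inclusive (an assumption of this sketch)."""
    region = seq[start - 1:end]
    for i in range(len(region)):
        for j in range(i + min_len, len(region) + 1):
            yield start + i, start + j - 1, region[i:j]


# Hypothetical 30-nucleotide toy sequence; it is NOT taken from SEQ ID NO:2.
toy = "ACGT" * 7 + "AC"

# A 21-nucleotide range (positions 5 to 25), fragments of 10 nt or more.
frags = list(fragments(toy, 5, 25, min_len=10))
print(len(frags))   # 78
print(frags[0])     # (5, 14, 'ACGTACGTAC')
```

Under this reading, a 21-nucleotide range such as n₁₁₈ to n₁₃₈ contains 78 distinct fragments of 10 nucleotides or more. The same function may be applied to peptide ranges by passing a one-letter amino acid string and `min_len=7`.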

The subject of the invention is in particular the following nucleotide sequences: SEQ ID NO:2, SEQ ID NO:4 and SEQ ID NO:6.

The oligonucleotide sequences may be advantageously synthesised by the Applied Bio System technique.

The subject of the invention is also a peptide sequence of HCV E1 comprising a peptide sequence chosen from the sequences of at least 7 amino acids between the following amino acids (aa): aa₅₈ to aa₆₆; aa₇₆ to aa₁₀₁ represented in the peptide sequence SEQ ID NO:3; aa₄₉ to aa₇₈; aa₉₈ to aa₁₁₁; aa₁₂₃ to aa₁₃₃; aa₁₄₀ to aa₁₄₉ represented in the peptide sequence SEQ ID NO:5; as well as homologous peptide sequences which do not induce modification of biological and immunological properties.

Preferably, the peptide sequence is chosen from the following amino acid sequences: aa₅₈ to aa₆₆; aa₇₆ to aa₁₀₁, represented in the peptide sequence SEQ ID NO:3; aa₄₉ to aa₇₈; aa₉₈ to aa₁₁₁; aa₁₂₃ to aa₁₃₃ and aa₁₄₀ to aa₁₄₉ represented in the peptide sequence SEQ ID NO:5.

Moreover, the peptide sequence is advantageously chosen from the peptide sequences SEQ ID NO:3, SEQ ID NO:5 and SEQ ID NO:7.

The subject of the invention is also a nucleotide sequence encoding a peptide sequence as defined above.

Moreover, the subject of the invention is a polynucleotide probe comprising a DNA sequence as defined above.

The subject of the invention is also an immunogenic peptide comprising a peptide sequence as defined above.

The peptide sequences according to the invention can be obtained by conventional methods of synthesis or by the application of genetic engineering techniques comprising the insertion of a DNA sequence, encoding a peptide sequence according to the invention, into an expression vector such as a plasmid and the transformation of cells using this expression vector and the culture of these cells.

The subject of the invention is also plasmids or expression vectors comprising a DNA sequence encoding a peptide sequence as defined above as well as hosts transformed using this vector.

The preferred plasmids are those deposited with CNCM on 5 Jun. 1991 under the numbers I-1105, I-1106 and I-1107.

The subject of the invention is also monoclonal antibodies directed against a peptide sequence according to the invention or an immunogenic sequence of such a polypeptide.

The monoclonal antibodies according to the invention can be prepared according to a conventional technique. For this purpose, the polypeptides may be coupled, if necessary, to an immunogenic agent such as tetanus anatoxin using a coupling agent such as glutaraldehyde, a carbodiimide or a bisdiazotised benzidine.

The present invention also encompasses the fragments and the derivatives of monoclonal antibodies according to the invention. These fragments are especially F(ab')₂ fragments which can be obtained by enzymatic cleavage of the antibody molecules with pepsin, the Fab' fragments which can be obtained by reducing the disulphide bridges of the F(ab')₂ fragments, and the Fab fragments which can be obtained by enzymatic cleavage of the antibody molecules with papain in the presence of a reducing agent. These fragments, as well as the Fc fragments, can also be obtained by genetic engineering.

The derivatives of monoclonal antibodies are for example antibodies or fragments of these antibodies to which markers, such as radioisotopes, are attached. The derivatives of monoclonal antibodies are also antibodies or fragments of these antibodies to which therapeutically active molecules are attached.

The subject of the invention is also an analytical kit for the detection of nucleotide sequences specific to the HCV E1 strain, comprising one or more probes as defined above.

The subject of the present invention is also an in vitro diagnostic process involving the detection of antigens specific to HCV E1, in a biological sample possibly containing the said antigens, in which the biological sample is exposed to an antibody or an antibody fragment, as defined above; as well as a diagnostic kit for carrying out the process.

The subject of the invention is also an in vitro diagnostic process involving the detection of antibodies specific to HCV E1 in a biological sample possibly containing the said antibodies, in which a biological sample is exposed to an antigen containing an epitope corresponding to a peptide sequence, as well as a diagnostic kit for the detection of specific antibodies, comprising an antigen containing an epitope corresponding to a peptide sequence as defined above.

These procedures may be based on a radioimmunological method of the RIA, RIPA or IRMA type or an immunoenzymatic method of the WESTERN-BLOT type carried out on strips or of the ELISA type.

The subject of the invention is also a therapeutic composition comprising monoclonal antibodies or fragments of monoclonal antibodies or derivatives of monoclonal antibodies as defined above.

Advantageously, the monoclonal antibody derivatives are monoclonal antibodies or fragments of these antibodies attached to a therapeutically active molecule.

The subject of the invention is also an immunogenic composition containing an immunogenic sequence as defined above, optionally attached to a carrier protein, the said immunogenic sequence being capable of inducing protective antibodies or cytotoxic T lymphocytes. Anatoxins such as tetanus anatoxin may be used as carrier protein. Alternatively, immunogens produced according to the MAP (Multiple Antigenic Peptide) technique may also be used.

In addition to the immunogenic peptide sequence, the immunogenic composition may contain an adjuvant possessing immunostimulant properties.

The following are among the adjuvants which may be used: inorganic salts such as aluminium hydroxide, hydrophobic compounds or surface-active agents such as incomplete Freund's adjuvant, squalene or liposomes, synthetic polynucleotides, microorganisms or microbial components such as murabutide, synthetic artificial molecules such as imuthiol or levamisole, or alternatively cytokines such as interferons α, β, γ or interleukins.

The subject of the invention is also a process for assaying a peptide sequence as defined above, comprising the use of monoclonal antibodies directed against this peptide sequence.

The subject of the invention is also a process for preparing a peptide sequence as defined above, comprising the insertion of a DNA sequence, encoding the peptide sequence, into an expression vector, the transformation of cells using this expression vector and the culture of the cells.

The production of the DNA of the sequences of the HCV E1 strain will be described below in greater detail with reference to the accompanying figures in which:

FIG. 1 represents the location of the amplified and sequenced HCV E1 regions;

FIG. 2 represents the comparison of the nucleotide sequence of HCV E1 (1) [SEQ ID NO:1], in the non-coding region, with the sequences of an American isolate (2) [SEQ ID NO:24] and two Japanese isolates: HCJ1 (3) [SEQ ID NO:25] and HCJ4 (4) [SEQ ID NO:26] respectively described in WO-A-90/14436 and by Okamoto et al. (12);

FIG. 3 represents the comparison of the nucleotide sequence of HCV E1 (1) [SEQ ID NO:3], in the region E1, with the sequences of an American isolate (HCVpt) (2) [SEQ ID NO:27] described in WO 90/14436 and three Japanese isolates: HCVJ-1 (3) [SEQ ID NO:28], HCJ1 (4) [SEQ ID NO:29] and HCJ4 (5) [SEQ ID NO:30] described in Takeuchi et al. (15); Okamoto et al. (12);

FIG. 4 represents the comparison of the amino acid sequence, in the region E1, of HCV E1 (1) [SEQ ID NO:3] with the American isolate HCVpt (2) [SEQ ID NO:31] and the Japanese isolates: HCVJ1 (3) [SEQ ID NO:32], HCJ1 (4) [SEQ ID NO:33] and HCJ4 (5) [SEQ ID NO:34]; the variable regions are boxed;

FIG. 5 represents the comparison of the nucleotide sequence, in the region E2/NS1, of HCV E1 (1) [SEQ ID NO:4] with the American isolate HCVpt (2) [SEQ ID NO:35] described in WO-A-90/14436 and the Japanese isolates HCJ1 (3) [SEQ ID NO:36], HCJ4 (4) [SEQ ID NO:37] and HCVJ1 (5) [SEQ ID NO:38] described by Okamoto et al. (12); Takeuchi et al. (15);

FIG. 6 represents a comparison of the amino acid sequence, in the region E2/NS1, of HCV E1 (1) [SEQ ID NO:5] with the American isolate HCVpt (2) [SEQ ID NO:39] and the Japanese isolates HCJ1 (3) [SEQ ID NO:40], HCJ4 (4) [SEQ ID NO:41] and HCVJ1 (5) [SEQ ID NO:42]; the variable regions are boxed;

FIG. 7 represents the hydrophilicity profile of HCV E1 in the region E2/NS1; the hydrophobic regions are located under the middle line;

FIG. 8 represents the comparison of the nucleotide sequence, in the region NS3/NS4, of HCV E1 (1) [SEQ ID NO:6] with the American isolate HCVpt (2) [SEQ ID NO:43] described in WO-A-90/14436 and the Japanese isolate HCVJ1 (3) [SEQ ID NO:44] described by Kubo et al. (7);

FIG. 9 represents the comparison of the amino acid sequence, in the region NS3/NS4, of HCV E1 (1) [SEQ ID NO:6] with the American isolate HCVpt (2) [SEQ ID NO:45] and the Japanese isolate HCVJ1 (3) [SEQ ID NO:46].

I- Preparation of the Nucleotide Sequences

1) Preparation of the HCV E1 RNA

The HCV E1 RNA was prepared as previously described in EP-A-0,318,216 from the serum of a French blood donor suffering from a chronic hepatitis, anti-HCV positive (anti-C100) (Kubo et al. (7)).

100 μl of serum were diluted in a final volume of 1 ml, in the following extraction buffer: 50 mM Tris-HCl, pH 8, 1 mM EDTA, 100 mM NaCl, 1 mg/ml of proteinase K, and 0.5% SDS. After digestion with proteinase K for 1 h at 37° C., the proteins were extracted with one volume of TE-saturated phenol (10 mM Tris-HCl, pH 8, 1 mM EDTA). The aqueous phase was then extracted twice with one volume of phenol/chloroform (1:1) and once with one volume of chloroform. The aqueous phase was then adjusted to a final concentration of 0.2 M sodium acetate and the nucleic acids were precipitated by the addition of two volumes of ethanol. After centrifugation, the nucleic acids were suspended in 30 μl of DEPC-treated sterile distilled water.
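The volumes implied by this extraction can be worked out as follows. This Python sketch is illustrative only: the text does not state the sodium acetate stock concentration or the volume of aqueous phase recovered, so the 3 M stock and the 1000 μl aqueous phase used here are assumptions, and the helper name `stock_volume_for_final` is hypothetical.

```python
def stock_volume_for_final(conc_final, conc_stock, sample_volume):
    """Volume of stock to add to sample_volume so that the mixture reaches
    conc_final. Solves conc_stock * v = conc_final * (sample_volume + v)."""
    if conc_stock <= conc_final:
        raise ValueError("stock must be more concentrated than the target")
    return conc_final * sample_volume / (conc_stock - conc_final)


# From the text: 100 ul of serum brought to a final volume of 1 ml.
serum_ul = 100.0
final_ul = 1000.0
dilution_factor = final_ul / serum_ul          # 10-fold dilution

# Assumptions (not stated in the text): aqueous phase of about 1000 ul
# after the organic extractions, and a 3 M sodium acetate stock.
aqueous_ul = 1000.0
stock_molar = 3.0

acetate_ul = stock_volume_for_final(0.2, stock_molar, aqueous_ul)
ethanol_ul = 2 * (aqueous_ul + acetate_ul)     # two volumes of ethanol

print(f"dilution: {dilution_factor:.0f}-fold")
print(f"sodium acetate stock to add: {acetate_ul:.1f} ul")
print(f"ethanol to add: {ethanol_ul:.1f} ul")
```

With a different stock concentration or recovered volume, only the two assumed values need to be changed.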

2) Reverse Transcription and Amplification

A complementary DNA (cDNA) was synthesised using as primer either oligonucleotides specific to HCV, represented in Table I below, or a mixture of hexanucleotides not specific to HCV, and murine reverse transcriptase. A PCR (Polymerase Chain Reaction) was carried out over 40 cycles at the following temperatures: 94° C. (1 min), 55° C. (1 min), 72° C. (1 min), on the cDNA thus obtained, using pairs of primers specific to HCV (Table I below). Various HCV primers were made from the sequence of HCV prototype (HCVpt), isolated from a chronically infected chimpanzee (Bradley et al. (2); Alter et al. (1), EP-A-0,318,216). The nucleotide sequence of the 5' region of the E2/NS1 gene was obtained using a strategy derived from the sequence-independent single primer amplification technique (SISPA) described by Reyes et al. (13). It consists in ligating double-stranded adaptors to the ends of the DNA synthesised using an HCV-specific primer localised in 5' of the HCVpt sequence (primer NS1A in Table I). A semi-specific amplification is then carried out using an HCV-specific primer as well as a primer corresponding to the adaptor. This approach makes it possible to obtain amplification products spanning the 5' region of the primer used for the synthesis of the cDNA.

TABLE I
__________________________________________________________________________
Sequence of the primers and probes.
__________________________________________________________________________
a) Primersᵃ:
NS3      (+)  5' ACAATACGTGTGTCACC                          (3013-3029)   [SEQ ID NO:8]
NS4      (-)  5' AAGTTCCACATATGCTTCGC                       (3955-3935)   [SEQ ID NO:9]
NS1A     (-)  5' TCCGTTGGCATAACTGATAG                       (83-64)       [SEQ ID NO:10]
NS1B     (+)  5' CTATCAGTTATGCCAACGGA                       (64-83)       [SEQ ID NO:11]
NS1C     (-)  5' GTTGCCCGCCCCTCCGATGT                       (380-361)     [SEQ ID NO:12]
NS1D     (+)  5' CCCAGCCCCGTGGTGGTGGG                       (183-202)     [SEQ ID NO:13]
NS1E     (-)  5' CCACAAGCAGGAGCAGACGC                       (860-841)     [SEQ ID NO:14]
NCA      (+)  5' CCATGGCGTTAGTATGAGT                        (-259--239)   [SEQ ID NO:15]
NCB      (-)  5' GCAGGTCTACGAGACCTC                         (-4--23)      [SEQ ID NO:16]
E1A      (+)  5' TTCTGGAAGACGGCGTGAAC                       (470-489)     [SEQ ID NO:17]
E1B      (-)  5' TCATCATATCCCATGCCATG                       (973-954)     [SEQ ID NO:18]
b) Probesᵃ:
NS3/NS4  (+)  5' CCTTCACCATTGAGACAATCACGCTCCCCCAGGATGCTGT   (3058-3097)   [SEQ ID NO:19]
NS1      (+)  5' CTGTCCTGAGAGGCTAGCCAGCTGCCGACCCCTTACCGAT   (5-44)        [SEQ ID NO:20]
NS1B/C   (+)  5' AGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGATA   (210-248)     [SEQ ID NO:21]
NC       (+)  5' GTGCAGCCTCCAGGACCCCC                       (235--216)    [SEQ ID NO:22]
E1       (-)  5' CTCGTACACAATACTCGAGT                       (646-627)     [SEQ ID NO:23]
__________________________________________________________________________
ᵃ The nucleotide sequences and their locations correspond to the HCV prototype (HCVpt) (EP-A-0,318,216 and WO-A-90/14436).
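The orientation signs (+) and (-) in Table I can be checked computationally. As a minimal sketch, not part of the original disclosure, the following Python code confirms that primer NS1A (-), which spans HCVpt positions 83-64, is the reverse complement of primer NS1B (+), which spans positions 64-83. The helper name `reverse_complement` is hypothetical; the two sequences are copied from Table I.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


NS1A = "TCCGTTGGCATAACTGATAG"   # (-) primer, HCVpt positions 83-64
NS1B = "CTATCAGTTATGCCAACGGA"   # (+) primer, HCVpt positions 64-83

print(reverse_complement(NS1A))           # CTATCAGTTATGCCAACGGA
print(reverse_complement(NS1A) == NS1B)   # True
```

The two primers thus anneal to opposite strands over the same 20 positions, which is consistent with NS1A being used for cDNA synthesis and NS1B for amplification in the opposite direction.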

3) Cloning and Sequencing

The amplification products were cloned into M13 mp19 or into the bacteriophage lambda gt 10 as described by Thiers et al. (17). The probes used for screening the DNA sequences are represented in Table I above. The nucleotide sequence of the inserts was determined by the dideoxynucleotide-based method described by Sanger et al. (14).

II- Study of the Nucleotide Sequences of the French Isolate (HCV E1)

The location of the various amplification products which made it possible to obtain the nucleotide sequence of the HCV E1 isolate in nonstructural and structural regions as well as in the noncoding region of the virus, is schematically represented in FIG. 1.

1) Nucleotide Sequence of HCV E1 in the Noncoding 5' Region

The amplified and sequenced noncoding 5' region of HCV E1 is called ID SEQ No.1. It corresponds to a 256-base pair (bp) fragment located in position -259 to -4 in HCVpt as described in WO-A-90/14436. Comparison of the HCV E1 sequence with those previously published shows a very high nucleic acid conservation (FIG. 2).
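The stated fragment length follows directly from the stated coordinates. The short Python sketch below (the helper name `span_length` is hypothetical) assumes inclusive numbering with both endpoints on the same side of the coordinate origin, so that no position is skipped or counted twice.

```python
def span_length(first, last):
    """Number of positions from first to last, both endpoints included.
    Assumes both coordinates lie on the same side of the origin."""
    return abs(last - first) + 1


# Noncoding 5' fragment: positions -259 to -4 in HCVpt numbering.
print(span_length(-259, -4))     # 256

# NS3/NS4 fragment quoted elsewhere in the text: positions 4361 to 5303.
print(span_length(4361, 5303))   # 943
```

Both results agree with the lengths given in the description (256 bp and 943 bp respectively).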

2) Nucleotide and Peptide Sequences of HCV E1 in the Structural Region

The nucleotide sequences probably correspond to two regions encoding the virus envelope proteins (currently designated as the E1 and E2/NS1 regions).

For the E1 region, the sequence obtained for HCV E1 corresponds to the 3' moiety of the gene. It has been called ID SEQ No.2. This 501-bp sequence is located between positions 470 and 973 in the HCVpt sequence as described in WO-A-90/14436. Comparison of this sequence with those previously described shows a high genetic variability (FIG. 3). Indeed, depending on the isolates studied, a difference of 10 to 27% in nucleic acid composition and 7 to 20% in amino acid composition may be observed as shown in Table II below. Furthermore, comparison of the peptide sequence reveals the existence of two hypervariable regions which are boxed in FIG. 4.
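The relationship between the 501-bp E1 nucleotide sequence and its 166-residue peptide sequence in the sequence listing can be illustrated by a simple translation. The Python sketch below is an illustration only: it uses the standard genetic code, and the reading-frame offset of 2 is an inference drawn from comparing the first residues of SEQ ID NO:2 and SEQ ID NO:3, not a statement made in the text. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, offset=0):
    """Translate dna into one-letter amino acids, starting at the given
    0-based offset and ignoring any incomplete trailing codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# First 20 nucleotides of SEQ ID NO:2 (identical to primer E1A in Table I).
e1_start = "TTCTGGAAGACGGCGTGAAC"

# Assumed frame: skip the first two nucleotides.
print(translate(e1_start, offset=2))   # LEDGVN

# Number of complete codons available in a 501-nt sequence read in that frame.
print((501 - 2) // 3)                  # 166
```

The output LEDGVN corresponds to Leu-Glu-Asp-Gly-Val-Asn, the first six residues of SEQ ID NO:3, and 166 complete codons match the stated peptide length.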

For the E2/NS1 region, the HCV E1 sequence data were obtained from three overlapping amplification products (FIG. 1). The consensus sequence thus obtained (1210 bp) contains the entire E2/NS1 gene and was called ID SEQ No.3. The sequence of the E2/NS1 region of HCV E1 is situated between positions 999 and 2209 of the HCVpt sequence described in WO-A-90/14436. Comparison of the HCV E1 sequences with the isolates previously described shows a difference of 13 to 33% in the case of nucleic acids and 11 to 30% in the case of amino acids (FIGS. 5 and 6, Table II). The highest variability is observed in 5' of the E2/NS1 gene (FIG. 5). Comparison of amino acids shows the existence of four hypervariable regions which are boxed in FIG. 6. The hydrophilicity profile of the E2/NS1 region (Kyte and Doolittle (9)) is given in FIG. 7. A hydrophilic region flanked by two hydrophobic regions is observed. The two hydrophobic regions probably correspond to the signal sequence and to the transmembrane segment. Finally, the central region has ten potential glycosylation sites (N-X-T/S), which are conserved in the various isolates (FIG. 6).
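Potential glycosylation sites of the N-X-T/S type can be located with a simple pattern search. The Python sketch below is illustrative: the function name `find_sequons` is hypothetical, the 20-residue fragment is a one-letter transcription of residues 81 to 100 of SEQ ID NO:5 made for this example, and X is taken to be any residue, as written in the text (some tools additionally exclude proline at X).

```python
import re

# Asparagine followed by any residue and then Thr or Ser. The lookahead
# keeps matches from consuming residues, so overlapping sites are found.
SEQUON = re.compile(r"N(?=[A-Z][ST])")


def find_sequons(protein, first_position=1):
    """Return the positions of the Asn residue of each N-X-T/S motif,
    numbered from first_position."""
    return [m.start() + first_position for m in SEQUON.finditer(protein.upper())]


# Residues 81-100 of SEQ ID NO:5, transcribed to one-letter code.
fragment = "INTNGSWHINRTALNCNESL"

print(find_sequons(fragment, first_position=81))   # [84, 90, 97]
```

In this fragment the motifs are N-G-S (Asn 84), N-R-T (Asn 90) and N-E-S (Asn 97); applying the same function to the complete peptide sequence would list every candidate site.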

3) Nucleotide and Peptide Sequence of HCV E1 in the Nonstructural Region

The sequence data for HCV E1 in the nonstructural region correspond to the 3' and 5' terminal parts of the NS3 and NS4 genes respectively (FIG. 1). The sequence obtained for HCV E1 (943 bp) is located in position 4361 to 5303 in the HCVpt sequence and was called ID SEQ No.4. The sequence homology is 95% with the HCVpt isolate and 78.6% with a Japanese isolate (FIG. 8, Table II below). In the case of the comparison of amino acids, a homology of 98% and 93% was observed with the HCVpt and Japanese isolates respectively (FIG. 9, Table II below).
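The homology figures quoted here are the complements of the percentage differences listed in Table II (for example 100 - 21.4 = 78.6). The way such a percentage is obtained from two aligned sequences can be sketched in Python as follows. This is an illustration, not the authors' method: the function name `percent_difference` is hypothetical, gaps are not handled, and the 40-nucleotide window compared is simply the NS3/NS4 probe of Table I (HCVpt) set against the corresponding stretch of SEQ ID NO:6 as located by inspection of the sequence listing.

```python
def percent_difference(a, b):
    """Percentage of mismatched positions between two sequences that are
    already aligned and of equal length (no gap handling)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


# NS3/NS4 probe from Table I (HCVpt sequence, positions 3058-3097).
HCVPT_PROBE = "CCTTCACCATTGAGACAATCACGCTCCCCCAGGATGCTGT"
# Corresponding 40-nt stretch of SEQ ID NO:6 (HCV E1), found by inspection.
HCV_E1_SEGMENT = "CCTTCACCATTGAAACAACAACGCTTCCCCAGGATGCTGT"

diff = percent_difference(HCVPT_PROBE, HCV_E1_SEGMENT)
print(f"difference over this window: {diff:.1f}%")

# Homology expressed as the complement of the Table II differences.
print(round(100 - 5.2, 1), round(100 - 21.4, 1))   # 94.8 78.6
```

The figure for this short window is not expected to equal the region-wide values of Table II, which are computed over the full 943-bp fragment.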

Thus, comparison of the nucleotide sequence of the HCV E1 isolate with that of the American and Japanese isolates shows that the French isolate is different from the isolates described above. It reveals the existence of highly variable regions in the envelope proteins. The variability of the nonstructural region studied is lower. Finally, the noncoding 5' region shows a high conservation.

These results have implications both for diagnosis and prevention of HCV.

As far as diagnosis is concerned, definition of the hypervariable regions and of the conserved regions can lead to:

the definition of synthetic peptides which allow the expression of epitopes specific to the various HCV groups.

For the envelope protein E1, peptides for the determination of type-specific epitopes are advantageously defined in a region between amino acids 75 and 100 (FIG. 4). Likewise, for the protein E2/NS1, peptides allowing characterisation of specific epitopes are synthesised in regions preferably between amino acids 50 and 149 (FIG. 6).

The expression of all or part of the cloned sequences, in particular clones corresponding to the envelope regions of the virus, makes it possible to obtain new antigens for the development of diagnostic reagents and for the production of immunogenic compositions. Finally, the preparation of a substantial part of the nucleotide sequence of this isolate allows the production of the entire length of complementary DNA which can be used for a better understanding of the mechanisms of the viral infection and also for diagnostic and preventive purposes.

TABLE II
______________________________________
Difference (%) in nucleic acids (n.a.) and amino acids (a.a.) between the French isolate (HCV E1) and the American (HCVpt) and Japanese (HCVJ1, HCJ1, HCJ4) isolates.
______________________________________
HCV E1 region          HCVpt   HCVJ1   HCJ1    HCJ4
______________________________________
E1           n.a.      10.6    27.3    10.4    26.5
             a.a.      7.2     19.9    8.4     20.5
E2/NS1       n.a.      12.8    33.2    14.5    29.8
             a.a.      12.2    29.7    15.6    26.1
NS3/NS4      n.a.      5.2     21.4    --      --
             a.a.      2.2     6.9     --      --
______________________________________

REFERENCES

1. Alter, H. J., Purcell, R. H., Shih, J. W., Melpolder, J. C., Houghton, M., Choo, Q. -L. & Kuo, G. (1989). Detection of antibody to hepatitis C virus in prospectively followed transfusion recipients with acute and chronic Non-A, Non-B hepatitis. New England Journal of Medicine 321, 1494-1500.

2. Bradley, D. W., Cook, E. H., Maynard, J. E., McCaustland, K. A., Ebert, J. W., Dolana, G. H., Petzel, R. A., Kantor, R. J., Heilbrunn, A., Fields, H. A. & Murphy, B. L. (1979). Experimental infection of chimpanzees with antihemophilic (factor VIII) materials: recovery of virus-like particles associated with Non-A, Non-B hepatitis. Journal of Medical Virology 3, 253-269.

3. Choo, Q. -L., Kuo, G., Weiner, A. J., Overby, L. R., Bradley, D. W. & Houghton, M. (1989). Isolation of a cDNA clone derived from a blood-borne Non-A, Non-B viral hepatitis genome. Science 244, 359-362.

4. Enomoto, N., Takada, A., Nakao, T. & Date, T. (1990). There are two major types of hepatitis C virus in Japan. Biochemical and Biophysical Research Communications 170, 1021-1025.

5. Hopf, U., Moller, B., Kuther, D., Stemerowicz, R., Lobeck, H., Ludtke-Handjery, A., Walter, E., Blum, H. E., Roggendorf, M. & Deinhardt, F. (1990). Long-term follow-up of post transfusion and sporadic chronic hepatitis Non-A, Non-B and frequency of circulating antibodies to hepatitis C virus (HCV). Journal of Hepatology 10, 69-76.

6. Kato, N., Hijikata, M., Ootsuyama, Y., Nakagawa, M., Ohkoshi, S., Sugimura, T. & Shimotohno, K. (1990). Molecular cloning of the human hepatitis C virus genome from Japanese patients with Non-A, Non-B hepatitis. Proceedings of the National Academy of Sciences, U.S.A. 87, 9524-9528.

7. Kubo, Y., Takeuchi, K., Boonmar, S., Katayama, T., Choo, Q. -L., Kuo, G., Weiner, A. J., Bradley, D. W., Houghton, M., Saito, I. & Miyamura, T. (1989). A cDNA fragment of hepatitis C virus isolated from an implicated donor of post-transfusion Non-A, Non-B hepatitis in Japan. Nucleic Acids Research 17, 10367-10372.

8. Kuo, G., Choo, Q. -L., Alter, H. J., Gitnick, G. L., Redeker, A. G., Purcell, R. H., Miyamura, T., Dienstag, J. L., Alter, M. J., Stevens, C. E., Tegtmeier, G. E., Bonino, F., Colombo, M., Lee, W. S., Kuo, C., Berger, K., Shuster, J. R., Overby, L. R., Bradley, D. W. & Houghton, M. (1989). An assay for circulating antibodies to a major etiologic virus of human Non-A, Non-B hepatitis. Science 244, 362-364.

9. Kyte, J. & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology 157, 105-132.

10. Miller, R. H. & Purcell, R. H. (1990). Hepatitis C virus shares amino acid sequence similarity with pestiviruses and flaviviruses as well as members of two plant virus super groups. Proceedings of the National Academy of Sciences, U.S.A. 87, 2057-2061.

11. Miyamura, T., Saito, T., Katayama, T., Kikuchi, S., Tateda, A., Houghton, M., Choo, Q. -L. & Kuo, G. (1990). Detection of antibody against antigen expressed by molecularly cloned hepatitis C virus cDNA: application to diagnosis and blood screening for posttransfusion hepatitis. Proceedings of the National Academy of Sciences, U.S.A. 87, 983-987.

12. Okamoto, H., Okada, S., Sugiyama, Y., Yotsumoto, S., Tanaka, T., Yoshizawa, H., Tsuda, F., Miyakawa, Y. & Mayumi, M. (1990). The 5' terminal sequence of the hepatitis C virus genome. Japanese Journal of Experimental Medicine 60, 167-177.

13. Reyes, G. R., Purdy, M. A., Kim, J. P., Luk, K. -C., Young, L. M., Fry, K. E. & Bradley, D. W. (1990). Isolation of a cDNA from the virus responsible for enterically transmitted Non-A, Non-B hepatitis. Science 247, 1335-1339.

14. Sanger, F., Nicklen, S. & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467.

15. Takeuchi, K., Boonmar, S., Kubo, Y., Katayama, T., Harada, H., Ohbayashi, A., Choo, Q. -L., Houghton, M., Saito, I. & Miyamura, T. (1990a). Hepatitis C viral cDNA clones isolated from a healthy carrier donor implicated in post-transfusion Non-A, Non-B hepatitis. Gene 91 (2), 287-291.

16. Takeuchi, K., Kubo, Y., Boonmar, S., Watanabe, Y., Katayama, T., Choo, Q. -L., Kuo, G., Houghton, M., Saito, I. & Miyamura, T. (1990b). Nucleotide sequence of core and envelope genes of the hepatitis C virus genome derived directly from human healthy carriers. Nucleic Acids Research 18, 4626.

17. Thiers, V., Nakajima, E. N., Kremsdorf, D., Mack, D., Schellekens, H., Driss, F., Goude, A., Wands, J., Sninsky, J., Tiollais, P. & Brechot, C. (1988). Transmission of hepatitis B from hepatitis B seronegative subjects. Lancet ii, 1273-1276.

______________________________________
Symbols for the amino acids
______________________________________
A    Ala    alanine
C    Cys    cysteine
D    Asp    aspartic acid
E    Glu    glutamic acid
F    Phe    phenylalanine
G    Gly    glycine
H    His    histidine
I    Ile    isoleucine
K    Lys    lysine
L    Leu    leucine
M    Met    methionine
N    Asn    asparagine
P    Pro    proline
Q    Gln    glutamine
R    Arg    arginine
S    Ser    serine
T    Thr    threonine
V    Val    valine
W    Trp    tryptophan
Y    Tyr    tyrosine
______________________________________

    __________________________________________________________________________    SEQUENCE LISTING    (1) GENERAL INFORMATION:    (iii) NUMBER OF SEQUENCES: 46    (2) INFORMATION FOR SEQ ID NO:1:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 256 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:    CCATGGCGTTAGTATGAGTGTCGTACAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATA60    GTGGTCTGCGGAGCCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGA120    TCAACCCGCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGACTGCTAGCCGAGTAGT180    GTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAG240    GTCTCGTAGACCGTGC256    (2) INFORMATION FOR SEQ ID NO:2:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 501 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:    TTCTGGAAGACGGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCC60    TCCTCCTGGCCCTGCTCTCTTGCCTGACTGTGCCCGCGTCAGCCTACCAAGTACGCAATT120    CTCGCGGCCTTTACCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGACGG180    CCGATAGCATTCTACACTCTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACACCTCGA240    AATGTTGGGTGGCGGTGGCCCCTACAGTCGCCACCAGAGACGGCAGACTCCCCACAACGC300    AGCTTCGACGTCATATCGATCTGCTCGTCGGGAGCGCCACCCTCTGCTCGGCCCTCTATG360    TGGGGGACTTGTGCGGGTCCGTCTTCCTCGTCGGTCAATTGTTCACCTTCTCCCCCAGGC420    GCCACTGGACAACGCAAGACTGCAACTGTTCCATCTACCCCGGCCACGTAACGGGTCACC480    GCATGGCATGGGATATGATGA501    (2) INFORMATION FOR SEQ ID NO:3:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 166 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:    LeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSer    151015    PheSerIleLeuLeuLeuAlaLeuLeuSerCysLeuThrValProAla    202530    SerAlaTyrGlnValArgAsnSerArgGlyLeuTyrHisValThrAsn    354045  
  AspCysProAsnSerSerIleValTyrGluThrAlaAspSerIleLeu    505560    HisSerProGlyCysValProCysValArgGluGlyAsnThrSerLys    65707580    CysTrpValAlaValAlaProThrValAlaThrArgAspGlyArgLeu    859095    ProThrThrGlnLeuArgArgHisIleAspLeuLeuValGlySerAla    100105110    ThrLeuCysSerAlaLeuTyrValGlyAspLeuCysGlySerValPhe    115120125    LeuValGlyGlnLeuPheThrPheSerProArgArgHisTrpThrThr    130135140    GlnAspCysAsnCysSerIleTyrProGlyHisValThrGlyHisArg    145150155160    MetAlaTrpAspMetMet    165    (2) INFORMATION FOR SEQ ID NO:4:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 1210 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:    AATGGCTCAACTGCTCAGGGTCCCGCAAGCCATCTTGGACATGATCGCTGGTGCCCACTG60    GGGAGTCCTAGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGCT120    AGTGCTGTTGCTGTTCGCCGGCGTCGATGCGGAAACCTACACCACCGGGGGGAGTACTGC180    CAGGACCACGCAAGGACTCGTCAGCCTTTTCAGTCGAGGCGCCAAGCAGGACATCCAGCT240    GATCAACACCAACGGCAGCTGGCACATTAATCGCACAGCTTTGAACTGTAATGAGAGCCT300    CGACACCGGCTGGGTAGCGGGGCTCTTCTATTACCACAAATTCAACTCTTCAGGCTGCCC360    CGAGAGGATGGCCAGCTGCAGACCCCTTGCCGATTTCGACCAGGGCTGGGGCCCTATCAG420    TTATGCCAACGGAACCGGCCCTGAACACCGCCCCTACTGCTGGCACTACCCCCCAAAGCC480    TTGTGGTATCGTGCCAGCACAGACCGTATGTGGCCCAGTGTATTGCTTCACTCCTAGCCC540    CGTGGTGGTGGGGACGACCAATAAGTTGGGCGCACCCACTTACAACTGGGGTTGTAATGA600    TACGGACGTCTTCGTCCTTAATAACACCAGGCCACCGCTGGGCAATTGGTTCGGCTGCAC660    CTGGGTGAACTCATCTGGATTTACTAAAGTGTGCGGAGCGCCTCCCTGTGTCATCGGAGG720    AGCGGGCAATAACACCTTGTACTGCCCCACTGACTGTTTCCGCAAGCATCCGGAAGCTAC780    ATACTCCCGATGTGGCTCCGGTCCTTGGATCACGCCCAGGTGCCTGGTTGGCTATCCTTA840    TAGGCTCTGGCATTATCCCTGTACTGTCAACTACACCCTGTTCAAGGTCAGGATGTACGT900    GGGAGGGGTCGAGCACAGGCTGCAAGTCGCTTGCAACTGGACGCGGGGCGAGCGTTGTAA960    TCTGGACGACAGGGACAGGTCCGAGCTCAGTCCGCTGCTGCTGTCTACCACACAGTGGCA1020    GGTCCTCCCGTGTTCCTTTACGACCTTGCCAGCCTTGACTACCGGCCTCATCCACCTCCA1080    
CCAGAACATCGTGGACGTGCAATATTTGTACGGGGTGGGGTCAAGCATTGTGTCCTGGGC1140    CATCAAGTGGGAGTACGTCATTCTCCTGTTTCTCCTGCTTGCAGACGCGCGCGTCTGCTC1200    CTGCTTGTGG1210    (2) INFORMATION FOR SEQ ID NO:5:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 403 amino acids    (B) TYPE: amino acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:    MetAlaGlnLeuLeuArgValProGlnAlaIleLeuAspMetIleAla    151015    GlyAlaHisTrpGlyValLeuAlaGlyIleAlaTyrPheSerMetVal    202530    GlyAsnTrpAlaLysValLeuLeuValLeuLeuLeuPheAlaGlyVal    354045    AspAlaGluThrTyrThrThrGlyGlySerThrAlaArgThrThrGln    505560    GlyLeuValSerLeuPheSerArgGlyAlaLysGlnAspIleGlnLeu    65707580    IleAsnThrAsnGlySerTrpHisIleAsnArgThrAlaLeuAsnCys    859095    AsnGluSerLeuAspThrGlyTrpValAlaGlyLeuPheTyrTyrHis    100105110    LysPheAsnSerSerGlyCysProGluArgMetAlaSerCysArgPro    115120125    LeuAlaAspPheAspGlnGlyTrpGlyProIleSerTyrAlaAsnGly    130135140    ThrGlyProGluHisArgProTyrCysTrpHisTyrProProLysPro    145150155160    CysGlyIleValProAlaGlnThrValCysGlyProValTyrCysPhe    165170175    ThrProSerProValValValGlyThrThrAsnLysLeuGlyAlaPro    180185190    ThrTyrAsnTrpGlyCysAsnAspThrAspValPheValLeuAsnAsn    195200205    ThrArgProProLeuGlyAsnTrpPheGlyCysThrTrpValAsnSer    210215220    SerGlyPheThrLysValCysGlyAlaProProCysValIleGlyGly    225230235240    AlaGlyAsnAsnThrLeuTyrCysProThrAspCysPheArgLysHis    245250255    ProGluAlaThrTyrSerArgCysGlySerGlyProTrpIleThrPro    260265270    ArgCysLeuValGlyTyrProTyrArgLeuTrpHisTyrProCysThr    275280285    ValAsnTyrThrLeuPheLysValArgMetTyrValGlyGlyValGlu    290295300    HisArgLeuGlnValAlaCysAsnTrpThrArgGlyGluArgCysAsn    305310315320    LeuAspAspArgAspArgSerGluLeuSerProLeuLeuLeuSerThr    325330335    ThrGlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeu    340345350    ThrThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGlnTyr    355360365    LeuTyrGlyValGlySerSerIleValSerTrpAlaIleLysTrpGlu    370375380    
TyrValIleLeuLeuPheLeuLeuLeuAlaAspAlaArgValCysSer    385390395400    CysLeuTrp    (2) INFORMATION FOR SEQ ID NO:6:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 943 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:    ACAATACGTGTGTCACCCAGACAGTCGACTTCAGCCTTGACCCTACCTTCACCATTGAAA60    CAACAACGCTTCCCCAGGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGG120    GGAAGCCAGGCATTTACAGATTTGTGGCACCTGGAGAGCGCCCCTCCGGCATGTTCGACT180    CGTCCGTCCTCTGCGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCG240    AGACCACAGTCAGGCTACGAGCATACATGAACACCCCGGGACTTCCCGTGTGCCAAGACC300    ATCTTGAGTTTTGGGAGGGCGTCTTCACGGGTCTCACCCATATAGACGCCCACTTCCTAT360    CCCAGACAAAGCAGAGTGGGGAAAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGT420    GCGCTAGGGCCCAAGCCCCTCCCCCGTCGTGGGACCAGATGTGGAAGTGCTTGATTCGTC480    TCAAGCCCACCCTCCATGGGCCAACACCCCTGCTATACCGACTGGGCGCTGTTCAGAATG540    AAGTCACCCTGACGCACCCAATCACCAAATATATCATGACATGCATGTCGGCTGACCTGG600    AGGTCGTCACGAGTACCTGGGTGCTCGTGGGCGGCGTTCTGGCTGCTTTGGCCGCGTATT660    GCCTATCCACAGGCTGCGTGGTCATAGTAGGCAGGGTCATTTTGTCCGGGAAGCCGGCAA720    TCATACCCGACAGGGAAGTCCTCTACCGGGAGTTCGATGAGATGGAAGAGTGCTCTCAGC780    ACTTGCCATACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCG840    GCCTCCTGCAAACACGGTCCCGCCAGGCAGAGGTCATCACCCCTGCTGTCCAGACCAACT900    GGCAGAGACTCGAGGCCTTCTGGGCGAAGCATATGTGGAACTT943    (2) INFORMATION FOR SEQ ID NO:7:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 313 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:    AsnThrCysValThrGlnThrValAspPheSerLeuAspProThrPhe    151015    ThrIleGluThrThrThrLeuProGlnAspAlaValSerArgThrGln    202530    ArgArgGlyArgThrGlyArgGlyLysProGlyIleTyrArgPheVal    354045    AlaProGlyGluArgProSerGlyMetPheAspSerSerValLeuCys    505560    GluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGlu    65707580    
ThrThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProVal    859095    CysGlnAspHisLeuGluPheTrpGluGlyValPheThrGlyLeuThr    100105110    HisIleAspAlaHisPheLeuSerGlnThrLysGlnSerGlyGluAsn    115120125    LeuProTyrLeuValAlaTyrGlnAlaThrValCysAlaArgAlaGln    130135140    AlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeu    145150155160    LysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAla    165170175    ValGlnAsnGluValThrLeuThrHisProIleThrLysTyrIleMet    180185190    ThrCysMetSerAlaAspLeuGluValValThrSerThrTrpValLeu    195200205    ValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeuSerThrGly    210215220    CysValValIleValGlyArgValIleLeuSerGlyLysProAlaIle    225230235240    IleProAspArgGluValLeuTyrArgGluPheAspGluMetGluGlu    245250255    CysSerGlnHisLeuProTyrIleGluGlnGlyMetMetLeuAlaGlu    260265270    GlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrArgSerArgGln    275280285    AlaGluValIleThrProAlaValGlnThrAsnTrpGlnArgLeuGlu    290295300    AlaPheTrpAlaLysHisMetTrpAsn    305310    (2) INFORMATION FOR SEQ ID NO:8:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 17 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:    ACAATACGTGTGTCACC17    (2) INFORMATION FOR SEQ ID NO:9:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:    AAGTTCCACATATGCTTCGC20    (2) INFORMATION FOR SEQ ID NO:10:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:    TCCGTTGGCATAACTGATAG20    (2) INFORMATION FOR SEQ ID NO:11:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) 
STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:    CTATCAGTTATGCCAACGGA20    (2) INFORMATION FOR SEQ ID NO:12:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:    GTTGCCCGCCCCTCCGATGT20    (2) INFORMATION FOR SEQ ID NO:13:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:    CCCAGCCCCGTGGTGGTGGG20    (2) INFORMATION FOR SEQ ID NO:14:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:    CCACAAGCAGGAGCAGACGC20    (2) INFORMATION FOR SEQ ID NO:15:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 19 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:    CCATGGCGTTAGTATGAGT19    (2) INFORMATION FOR SEQ ID NO:16:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 18 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:    GCAGGTCTACGAGACCTC18    (2) INFORMATION FOR SEQ ID NO:17:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:    
TTCTGGAAGACGGCGTGAAC20    (2) INFORMATION FOR SEQ ID NO:18:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA primer    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:    TCATCATATCCCATGCCATG20    (2) INFORMATION FOR SEQ ID NO:19:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 40 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA probe    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:    CCTTCACCATTGAGACAATCACGCTCCCCCAGGATGCTGT40    (2) INFORMATION FOR SEQ ID NO:20:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 40 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA probe    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:    CTGTCCTGAGAGGCTAGCCAGCTGCCGACCCCTTACCGAT40    (2) INFORMATION FOR SEQ ID NO:21:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 40 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA probe    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:    AGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGATA40    (2) INFORMATION FOR SEQ ID NO:22:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA probe    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:    GTGCAGCCTCCAGGACCCCC20    (2) INFORMATION FOR SEQ ID NO:23:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 20 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: DNA probe    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:    CTCGTACACAATACTCGAGT20    (2) INFORMATION FOR SEQ ID NO:24:    (i) SEQUENCE CHARACTERISTICS:    (A) 
LENGTH: 256 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:    CCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATA60    GTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGA120    TAAACCCGCTCAATGCCTGGAGATTTGGGCGCGCCCCCGCGAGACTGCTAGCCGAGTAGT180    GTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAG240    GTCTCGTAGACCGTGC256    (2) INFORMATION FOR SEQ ID NO:25:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 256 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:    CCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATA60    GTGGTCTGCGGAGCCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGA120    TAAACCCGCTCAATGCCTGGAGATTTGGGCGCGCCCCCGCAAGACTGCTAGCCGAGTAGT180    GTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAG240    GTCTCGTAGACCGTGC256    (2) INFORMATION FOR SEQ ID NO:26:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 256 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:    CCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATA60    GTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGA120    TAAACCCGCTCAATGCCTGGAGATTTGGGCGCGCCCCCGCGAGACTGCTAGCCGAGTAGT180    GTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAG240    GTCTCGTAGACCGTGC256    (2) INFORMATION FOR SEQ ID NO:27:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 501 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:    
TTCTGGAAGACGGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCT60    TCCTTCTGGCCCTGCTCTCTTGCTTGACTGTGCCCGCTTCGGCCTACCAAGTGCGCAATT120    CCACGGGGCTTTACCACGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGG180    CCGATGCCATCCTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAACGCCTCGA240    GGTGTTGGGTGGCGATGACCCCTACGGTGGCCACCAGGGATGGAAGACTCCCCGCGACGC300    AGCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCCCTCTACG360    TGGGGGACCTATGCGGGTCTGTCTTTCTTGTCGGCCAATTGTTCACCTTCTCTCCCAGGC420    GCCACTGGACGACGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACC480    GCATGGCATGGGATATGATGA501    (2) INFORMATION FOR SEQ ID NO:28:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 501 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:    TTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCCGGTTGCTCTTTCTCTATCT60    TCCTCTTGGCTCTGCTGTCCTGTTTGACCATCCCAGCTTCCGCTTATGAAGTGCGCAACG120    TGTCCGGGATATACCATGTCACAAACGACTGCTCCAACTCAAGCATTGTGTATGAGGCGG180    CGGACGTGATCATGCATGCCCCCGGGTGCGTGCCCTGCGTTCGGGAGAACAATTCCTCCC240    GTTGCTGGGTAGCGCTCACTCCCACGCTCGCGGCCAGGAATGCCAGCGTCCCCACTACGA300    CATTACGACGCCACGTCGACTTGCTCGTTGGGACGGCTGCTTTCTGCTCCGCTATGTACG360    TGGGGGATCTCTGCGGATCTGTTTTCCTCATCTCCCAGCTGTTCACCTTCTCGCCTCGCC420    GGCATGAGACAGTACAGGACTGCAACTGCTCAATCTATCCCGGCCACGTATCAGGCCATC480    GCATGGCTTGGGATATGATGA501    (2) INFORMATION FOR SEQ ID NO:29:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 501 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:    TTCTGGAAGACGGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCT60    TCCTTCTGGCCCTGCTCTCTTGCCTGACTGTGCCCGCTTCAGCCTACCAAGTGCGCAACT120    CCACAGGGCTTTATCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGC180    ACGATGCCATCCTGCATACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGCAACGTCTCGA240    
GGTGTTGGGTGGCGATGACCCCCACGGTAGCCACCAGGGACGGAAGACTCCCCGCGACGC300    AGCTTCGACGTCACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCCCTCTACG360    TGGGGGATCTGTGCGGGTCCGTCTTCCTTATTGGTCAACTGTTTACCTTCTCTCCCAGGC420    GCCACTGGACAACGCAAGGCTGCAATTGTTCTATCTACCCCGGCCATATAACGGGTCATC480    GCATGGCATGGGATATGATGA501    (2) INFORMATION FOR SEQ ID NO:30:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 501 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:    TTCTGGAGGACGGCGTGAACTATGCAACAGGGAACTTGCCCGGTTGCTCTTTCTCTATCT60    TCCTCTTGGCTTTGCTGTCCTGTTTGACCATCCCAGCTTCCGCTTATGAAGTGCGCAACG120    TGTCCGGGATATACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTATGAGGCAG180    CGGACATGATCATGCATACTCCCGGGTGCGTGCCCTGCGTTCGGGAGGACAACAGCTCCC240    GTTGCTGGGTAGCGCTCACTCCCACGCTCGCGGCCAGGAATGCCAGCGTCCCCACTACGA300    CAATACGACGCCACGTCGACTTGCTCGTTGGGGCGGCTGCTTTCTGCTCCGCTATGTACG360    TGGGGGATCTCTGCGGATCTGTTTTCCTCGTCTCCCAGCTGTTCACCTTCTCGCCTCGCC420    GGCATGAGACAGTGCAGGACTGCAACTGCTCAATCTATCCCGGCCATTTATCAGGTCACC480    GCATGGCTTGGGATATGATGA501    (2) INFORMATION FOR SEQ ID NO:31:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 166 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:    LeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSer    151015    PheSerIlePheLeuLeuAlaLeuLeuSerCysLeuThrValProAla    202530    SerAlaTyrGlnValArgAsnSerThrGlyLeuTyrHisValThrAsn    354045    AspCysProAsnSerSerIleValTyrGluAlaAlaAspAlaIleLeu    505560    HisThrProGlyCysValProCysValArgGluGlyAsnAlaSerArg    65707580    CysTrpValAlaMetThrProThrValAlaThrArgAspGlyArgLeu    859095    ProAlaThrGlnLeuArgArgHisIleAspLeuLeuValGlySerAla    100105110    ThrLeuCysSerAlaLeuTyrValGlyAspLeuCysGlySerValPhe    115120125    LeuValGlyGlnLeuPheThrPheSerProArgArgHisTrpThrThr    130135140    GlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArg    
145150155160    MetAlaTrpAspMetMet    165    (2) INFORMATION FOR SEQ ID NO:32:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 166 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:    LeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSer    151015    PheSerIlePheLeuLeuAlaLeuLeuSerCysLeuThrIleProAla    202530    SerAlaTyrGluValArgAsnValSerGlyIleTyrHisValThrAsn    354045    AspCysSerAsnSerSerIleValTyrGluAlaAlaAspValIleMet    505560    HisAlaProGlyCysValProCysValArgGluAsnAsnSerSerArg    65707580    CysTrpValAlaLeuThrProThrLeuAlaAlaArgAsnAlaSerVal    859095    ProThrThrThrLeuArgArgHisValAspLeuLeuValGlyThrAla    100105110    AlaPheCysSerAlaMetTyrValGlyAspLeuCysGlySerValPhe    115120125    LeuIleSerGlnLeuPheThrPheSerProArgArgHisGluThrVal    130135140    GlnAspCysAsnCysSerIleTyrProGlyHisValSerGlyHisArg    145150155160    MetAlaTrpAspMetMet    165    (2) INFORMATION FOR SEQ ID NO:33:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 166 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:    LeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSer    151015    PheSerIlePheLeuLeuAlaLeuLeuSerCysLeuThrValProAla    202530    SerAlaTyrGlnValArgAsnSerThrGlyLeuTyrHisValThrAsn    354045    AspCysProAsnSerSerIleValTyrGluAlaHisAspAlaIleLeu    505560    HisThrProGlyCysValProCysValArgGluGlyAsnValSerArg    65707580    CysTrpValAlaMetThrProThrValAlaThrArgAspGlyArgLeu    859095    ProAlaThrGlnLeuArgArgHisIleAspLeuLeuValGlySerAla    100105110    ThrLeuCysSerAlaLeuTyrValGlyAspLeuCysGlySerValPhe    115120125    LeuIleGlyGlnLeuPheThrPheSerProArgArgHisTrpThrThr    130135140    GlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArg    145150155160    MetAlaTrpAspMetMet    165    (2) INFORMATION FOR SEQ ID NO:34:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 166 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: 
peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:    LeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSer    151015    PheSerIlePheLeuLeuAlaLeuLeuSerCysLeuThrIleProAla    202530    SerAlaTyrGluValArgAsnValSerGlyIleTyrHisValThrAsn    354045    AspCysSerAsnSerSerIleValTyrGluAlaAlaAspMetIleMet    505560    HisThrProGlyCysValProCysValArgGluAspAsnSerSerArg    65707580    CysTrpValAlaLeuThrProThrLeuAlaAlaArgAsnAlaSerVal    859095    ProThrThrThrIleArgArgHisValAspLeuLeuValGlyAlaAla    100105110    AlaPheCysSerAlaMetTyrValGlyAspLeuCysGlySerValPhe    115120125    LeuValSerGlnLeuPheThrPheSerProArgArgHisGluThrVal    130135140    GlnAspCysAsnCysSerIleTyrProGlyHisLeuSerGlyHisArg    145150155160    MetAlaTrpAspMetMet    165    (2) INFORMATION FOR SEQ ID NO:35:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 1210 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:    AATGGCTCAGCTGCTCCGGATCCCACAAGCCATCTTGGACATGATCGCTGGTGCTCACTG60    GGGAGTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGT120    AGTGCTGCTGCTATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGG180    CCACACTGTGTCTGGATTTGTTAGCCTCCTCGCACCAGGCGCCAAGCAGAACGTCCAGCT240    GATCAACACCAACGGCAGTTGGCACCTCAATAGCACGGCTCTGAACTGCAATGATAGCCT300    TAACACCGGCTGGTTGGCAGGGCTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCC360    TGAGAGGCTAGCCAGCTGCCGACCCCTTACCGATTTTGACCAGGGCTGGGGCCCTATCAG420    TTATGCCAACGGAAGCGGCCCCGACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACC480    TTGCGGTATTGTGCCCGCGAAGAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCC540    CGTGGTGGTGGGAACGACCGACAGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGA600    TACGGACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTAC660    CTGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTGTCATCGGAGG720    GGCGGGCAACAACACCCTGCACTGCCCCACTGATTGCTTCCGCAAGCATCCGGACGCCAC780    ATACTCTCGGTGCGGCTCCGGTCCCTGGATCACACCCAGGTGCCTGGTCGACTACCCGTA840    
TAGGCTTTGGCATTATCCTTGTACCATCAACTACACCATATTTAAAATCAGGATGTACGT900    GGGAGGGGTCGAACACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCGAACGTTGCGA960    TCTGGAAGACAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTACACAGTGGCA1020    GGTCCTCCCGTGTTCCTTCACAACCCTACCAGCCTTGTCCACCGGCCTCATCCACCTCCA1080    CCAGAACATTGTGGACGTGCAGTACTTGTACGGGGTGGGGTCAAGCATCGCGTCCTGGGC1140    CATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTTCTGCTTGCAGACGCGCGCGTCTGCTC1200    CTGCTTGTGG1210    (2) INFORMATION FOR SEQ ID NO:36:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 541 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:    AATGGCTCAGCTGCTCCGCATCCCACAAGCCATCTTGGATATGATCGCTGGTGCTCACTG60    GGGAGTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGT120    AGTGCTGTTGCTGTTTGCCGGCGTCGACGCGGAAACCATCGTCTCCGGGGGACAAGCCGC180    CCGCGCCATGTCTGGACTTGTTAGTCTCTTCACACCAGGCGCTAAGCAGAACATCCAGCT240    GATCAACACCAACGGCAGTTGGCACATCAATAGCACGGCCTTGAACTGCAATGAAAGCCT300    TAACACCGGCTGGTTAGCAGGGCTTATCTATCAACACAAATTCAACTCTTCGGGCTGTCC360    CGAGAGGTTGGCCAGCTGCCGACGCCTTACCGATTTTGACCAGGGCTGGGGCCCTATCAG420    TCATGCCAACGGAAGCGGCCCCGACCAACGCCCCTATTGTTGGCACTACCCCCCAAAACC480    TTGCGGTATCGTGCCCGCAAAGAGCGTATGTGGCCCGGTATATTGCTTCACTCCCAGCCC540    C541    (2) INFORMATION FOR SEQ ID NO:37:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 541 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:    GGTGTCGCAGTTGCTCCGGATCCCACAAGCTGTCGTGGACATGGTGGCGGGGGCCCACTG60    GGGAGTCCTGGCGGGCCTTGCCTACTATTCCATGGTAGGGAACTGGGCTAAGGTCCTGAT120    TGTGGCGCTACTCTTCGCCGGCGTTGACGGGGAGACCTACACGTCGGGGGGGGCGGCCAG180    CCACACCACCTCCACGCTCGCGTCCCTCTTCTCACCTGGGGCGTCTCAGAGAATCCAGCT240    TGTGAATACCAACGGCAGCTGGCACATCAACAGGACTGCCCTAAACTGCAATGACTCCCT300    
CCACACTGGGTTCCTTGCCGCGCTGTTCTACACACACAGGTTCAACTCGTCCGGGTGCCC360    GGAGCGCATGGCCAGCTGCCGCCCCATTGACTGGTTCGCCCAGGGATGGGGCCCCATCAC420    CTATACTGAGCCTGACAGCCCGGATCAGAGGCCTTATTGCTGGCATTACGCGCCTCGACC480    GTGTGGTATCGTACCCGCGTCGCAGGTGTGTGGTCCAGTGTATTGCTTCACCCCAAGCCC540    T541    (2) INFORMATION FOR SEQ ID NO:38:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 325 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:    GGTGTCGCAGTTACTCCGGATCCCACAAGCTGTCATGGACATGGTGGCGGGGGCCCACTG60    GGGAGTCCTAGCGGGCCTTGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTTTTGAT120    TGTGATGCTACTCTTTGCCGGCGTTGACGGGCATACCCGCGTGACGGGGGGGGTGCAAGG180    CCACGTCACCTCTACACTCACGTCCCTCTTTAGACCTGGGGCGTCCCAGAAAATTCAGCT240    TGTAAACACCAATGGCAGTTGGCATATCAACAGGACTGCCCTGAACTGCAATGACTCCCT300    CCAAACTGGGTTCCTTGCCGCGCTG325    (2) INFORMATION FOR SEQ ID NO:39:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 403 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:    MetAlaGlnLeuLeuArgIleProGlnAlaIleLeuAspMetIleAla    151015    GlyAlaHisTrpGlyValLeuAlaGlyIleAlaTyrPheSerMetVal    202530    GlyAsnTrpAlaLysValLeuValValLeuLeuLeuPheAlaGlyVal    354045    AspAlaGluThrHisValThrGlyGlySerAlaGlyHisThrValSer    505560    GlyPheValSerLeuLeuAlaProGlyAlaLysGlnAsnValGlnLeu    65707580    IleAsnThrAsnGlySerTrpHisLeuAsnSerThrAlaLeuAsnCys    859095    AsnAspSerLeuAsnThrGlyTrpLeuAlaGlyLeuPheTyrHisHis    100105110    LysPheAsnSerSerGlyCysProGluArgLeuAlaSerCysArgPro    115120125    LeuThrAspPheAspGlnGlyTrpGlyProIleSerTyrAlaAsnGly    130135140    SerGlyProAspGlnArgProTyrCysTrpHisTyrProProLysPro    145150155160    CysGlyIleValProAlaLysSerValCysGlyProValTyrCysPhe    165170175    ThrProSerProValValValGlyThrThrAspArgSerGlyAlaPro    180185190    ThrTyrSerTrpGlyGluAsnAspThrAspValPheValLeuAsnAsn    195200205    
ThrArgProProLeuGlyAsnTrpPheGlyCysThrTrpMetAsnSer    210215220    ThrGlyPheThrLysValCysGlyAlaProProCysValIleGlyGly    225230235240    AlaGlyAsnAsnThrLeuHisCysProThrAspCysPheArgLysHis    245250255    ProAspAlaThrTyrSerArgCysGlySerGlyProTrpIleThrPro    260265270    ArgCysLeuValAspTyrProTyrArgLeuTrpHisTyrProCysThr    275280285    IleAsnTyrThrIlePheLysIleArgMetTyrValGlyGlyValGlu    290295300    HisArgLeuGluAlaAlaCysAsnTrpThrArgGlyGluArgCysAsp    305310315320    LeuGluAspArgAspArgSerGluLeuSerProLeuLeuLeuThrThr    325330335    ThrGlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeu    340345350    SerThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGlnTyr    355360365    LeuTyrGlyValGlySerSerIleAlaSerTrpAlaIleLysTrpGlu    370375380    TyrValValLeuLeuPheLeuLeuLeuAlaAspAlaArgValCysSer    385390395400    CysLeuTrp    (2) INFORMATION FOR SEQ ID NO:40:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 180 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:    MetAlaGlnLeuLeuArgIleProGlnAlaIleLeuAspMetIleAla    151015    GlyAlaHisTrpGlyValLeuAlaGlyIleAlaTyrPheSerMetVal    202530    GlyAsnTrpAlaLysValLeuValValLeuLeuLeuPheAlaGlyVal    354045    AspAlaGluThrIleValSerGlyGlyGlnAlaAlaArgAlaMetSer    505560    GlyLeuValSerLeuPheThrProGlyAlaLysGlnAsnIleGlnLeu    65707580    IleAsnThrAsnGlySerTrpHisIleAsnSerThrAlaLeuAsnCys    859095    AsnGluSerLeuAsnThrGlyTrpLeuAlaGlyLeuIleTyrGlnHis    100105110    LysPheAsnSerSerGlyCysProGluArgLeuAlaSerCysArgArg    115120125    LeuThrAspPheAspGlnGlyTrpGlyProIleSerHisAlaAsnGly    130135140    SerAlaProAspGlnArgProTyrCysTrpHisTyrProProLysPro    145150155160    CysGlyIleValProAlaLysSerValCysGlyProValTyrCysPhe    165170175    ThrProSerPro    180    (2) INFORMATION FOR SEQ ID NO:41:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 180 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:    
ValSerGlnLeuLeuArgIleProGlnAlaValValAspMetValAla    151015    GlyAlaHisTrpGlyValLeuAlaGlyLeuAlaTyrTyrSerMetVal    202530    GlyAsnTrpAlaLysValLeuIleValAlaLeuLeuPheAlaGlyVal    354045    AspGlyGluThrTyrThrSerGlyGlyAlaAlaSerHisThrThrSer    505560    ThrLeuAlaSerLeuPheSerProGlyAlaSerGlnArgIleGlnLeu    65707580    ValAsnThrAsnGlySerTrpHisIleAsnArgThrAlaLeuAsnCys    859095    AsnAspSerLeuHisThrGlyPheLeuAlaAlaLeuPheTyrThrHis    100105110    ArgPheAsnSerSerGlyCysProGluArgMetAlaSerCysArgPro    115120125    IleAspTrpPheAlaGlnGlyTrpGlyProIleThrTyrThrGluPro    130135140    AspSerProAspGlnArgProTyrCysTrpHisTyrAlaProArgPro    145150155160    CysGlyIleValProAlaSerGlnValCysGlyProValTyrCysPhe    165170175    ThrProSerPro    180    (2) INFORMATION FOR SEQ ID NO:42:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 108 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:    ValSerGlnLeuLeuArgIleProGlnAlaValMetAspMetValAla    151015    GlyAlaHisTrpGlyValLeuAlaGlyLeuAlaTyrTyrSerMetVal    202530    GlyAsnTrpAlaLysValLeuIleValMetLeuLeuPheAlaGlyVal    354045    AspGlyHisThrArgValThrGlyGlyValGlnGlyHisValThrSer    505560    ThrLeuThrSerLeuPheArgProGlyAlaSerGlnLysIleGlnLeu    65707580    ValAsnThrAsnGlySerTrpHisIleAsnArgThrAlaLeuAsnCys    859095    AsnAspSerLeuGlnThrGlyPheLeuAlaAlaLeu    100105    (2) INFORMATION FOR SEQ ID NO:43:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 943 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:    ACAATACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGA60    CAATCACGCTCCCCCAGGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGG120    GGAAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACT180    CGTCCGTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCG240    AGACTACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACC300    
ATCTTGAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTAT360    CCCAGACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGT420    GCGCTAGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCC480    TCAAGCCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATG540    AAATCACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGG600    AGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATT660    GCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAA720    TCATACCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGC780    ACTTACCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCG840    GCCTCCTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACT900    GGCAAAAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTT943    (2) INFORMATION FOR SEQ ID NO:44:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 569 base pairs    (B) TYPE: nucleic acid    (C) STRANDEDNESS: single    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: Other    (A) DESCRIPTION: cDNA to genomic RNA    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:    GTAACACATGTGTCACTCAGACGGTCGATTTCAGCTTGGATCCCACTCTCACCATCGAGA60    CGACGACCGTGCCCCAAGATGCGGTTTCGCGCACGCAGCGGCGAGGTAGGACTGGCAGGG120    GCAGGAGAGGCATCTATAGGTTTGTGACTCCAGGAGAACGGCCCTCGGCGATGTTCGATT180    CTTCGGTCCTATGTGAGTGTTATGACGCGGGCTGTGCTTGGTATGAGCTCACGCCCGCTG240    AGACCTCGGTTAGGTTGCGGGCTTACCTAAATACACCAGGGTTGCCCGTCTGCCAGGACC300    ATCTGGAGTTCTGGGAGAGCGTCTTCACAGGCCTCACCCACATAGACGCCCACTTCTTGT360    CCCAGACTAAGCAGGCAGGAGACAACTTCCCCTACCTGGTAGCATACCAAGCCACAGTGT420    GCGCCAGGGCTAAGGCTCCACCTCCATCGTGGGATCAAATGTGGAAGTGTCTCATACGGC480    TAAAGCCTACGCTGCACGGGCCAACGCCCCTGCTGTATAGGCTAGGAGCCGTCCAGAATG540    AGGTCACCCTCACACACCCTATAACCAAA569    (2) INFORMATION FOR SEQ ID NO:45:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 313 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:    AsnThrCysValThrGlnThrValAspPheSerLeuAspProThrPhe    151015    ThrIleGluThrIleThrLeuProGlnAspAlaValSerArgThrGln    202530    
ArgArgGlyArgThrGlyArgGlyLysProGlyIleTyrArgPheVal    354045    AlaProGlyGluArgProSerGlyMetPheAspSerSerValLeuCys    505560    GluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGlu    65707580    ThrThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProVal    859095    CysGlnAspHisLeuGluPheTrpGluGlyValPheThrGlyLeuThr    100105110    HisIleAspAlaHisPheLeuSerGlnThrLysGlnSerGlyGluAsn    115120125    LeuProTyrLeuValAlaTyrGlnAlaThrValCysAlaArgAlaGln    130135140    AlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeu    145150155160    LysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAla    165170175    ValGlnAsnGluIleThrLeuThrHisProValThrLysTyrIleMet    180185190    ThrCysMetSerAlaAspLeuGluValValThrSerThrTrpValLeu    195200205    ValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeuSerThrGly    210215220    CysValValIleValGlyArgValValLeuSerGlyLysProAlaIle    225230235240    IleProAspArgGluValLeuTyrArgGluPheAspGluMetGluGlu    245250255    CysSerGlnHisLeuProTyrIleGluGlnGlyMetMetLeuAlaGlu    260265270    GlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaSerArgGln    275280285    AlaGluValIleAlaProAlaValGluThrAsnTrpGlnLysLeuGlu    290295300    ThrPheTrpAlaLysHisMetTrpAsn    305310    (2) INFORMATION FOR SEQ ID NO:46:    (i) SEQUENCE CHARACTERISTICS:    (A) LENGTH: 189 amino acids    (B) TYPE: amino acid    (D) TOPOLOGY: linear    (ii) MOLECULE TYPE: peptide    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:    AsnThrCysValThrGlnThrValAspPheSerLeuAspProThrLeu    151015    ThrIleGluThrThrThrValProGlnAspAlaValSerArgThrGln    202530    ArgArgGlyArgThrGlyArgGlyArgArgGlyIleTyrArgPheVal    354045    ThrProGlyGluArgProSerAlaMetPheAspSerSerValLeuCys    505560    GluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGlu    65707580    ThrSerValArgLeuArgAlaTyrLeuAsnThrProGlyLeuProVal    859095    CysGlnAspHisLeuGluPheTrpGluSerValPheThrGlyLeuThr    100105110    HisIleAspAlaHisPheLeuSerGlnThrLysGlnAlaGlyAspAsn    115120125    PheProTyrLeuValAlaTyrGlnAlaThrValCysAlaArgAlaLys    130135140    AlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeu    145150155160 
   LysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAla    165170175    ValGlnAsnGluValThrLeuThrHisProIleThrLys    180185    __________________________________________________________________________
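The primer entries in the listing can be cross-checked against the cDNA sequences they flank. The following Python snippet is a minimal sketch of such a check, not part of the specification: the function name `reverse_complement` and the variable names are illustrative, and only the terminal nucleotides of SEQ ID NO:6 and SEQ ID NO:2 are copied from the listing. It tests whether SEQ ID NO:8 matches the 5' end of SEQ ID NO:6, and whether SEQ ID NO:9 and SEQ ID NO:18 are the reverse complements of the 3' ends of SEQ ID NO:6 and SEQ ID NO:2, respectively.

```python
# Illustrative check only; names below are not from the specification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# Terminal nucleotides copied from the listing, written 5'->3'.
SEQ6_FIRST_17 = "ACAATACGTGTGTCACC"     # nt 1-17 of SEQ ID NO:6
SEQ6_LAST_20 = "GCGAAGCATATGTGGAACTT"   # nt 924-943 of SEQ ID NO:6
SEQ2_LAST_20 = "CATGGCATGGGATATGATGA"   # nt 482-501 of SEQ ID NO:2

PRIMER_8 = "ACAATACGTGTGTCACC"          # SEQ ID NO:8
PRIMER_9 = "AAGTTCCACATATGCTTCGC"       # SEQ ID NO:9
PRIMER_18 = "TCATCATATCCCATGCCATG"      # SEQ ID NO:18

# Sense primer: identical to the 5' end of the template.
sense_ok = PRIMER_8 == SEQ6_FIRST_17
# Antisense primers: reverse complement of the 3' end of the template.
antisense_6_ok = PRIMER_9 == reverse_complement(SEQ6_LAST_20)
antisense_2_ok = PRIMER_18 == reverse_complement(SEQ2_LAST_20)
```

The same comparison can be repeated for any other primer or probe in the listing by copying the relevant stretch of the template sequence.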

We claim:
 1. An oligonucleotide encoding a peptide, wherein said peptide is an amino acid (aa) sequence selected from the group consisting of: aa₅₈ to aa₆₆ of SEQ ID NO:3; aa₄₉ to aa₇₈ of SEQ ID NO:5; aa₁₂₃ to aa₁₃₃ of SEQ ID NO:5; SEQ ID NO:3; SEQ ID NO:5; and SEQ ID NO:7.
 2. An oligonucleotide encoding a peptide, wherein said oligonucleotide is a DNA sequence selected from the group consisting of: a) n177 to n202 of SEQ ID NO:2; b) n233 to n247 of SEQ ID NO:2; c) n254 to n272 of SEQ ID NO:2; d) n272 to n288 of SEQ ID NO:2; e) n156 to n170 of SEQ ID NO:4; f) n170 to n217 of SEQ ID NO:4; g) n310 to n334 of SEQ ID NO:4; h) SEQ ID NO:2; i) SEQ ID NO:4; and j) SEQ ID NO:6.
 3. An oligonucleotide encoding a peptide, wherein said oligonucleotide is a DNA sequence selected from the group consisting of: a) n118 to n138 of SEQ ID NO:2; and b) n267 to n283 of SEQ ID NO:4.
 4. An oligonucleotide probe comprising a DNA molecule according to any one of claims 1, 2, or 3, wherein said DNA molecule is labeled.
 5. An expression vector comprising a DNA molecule or oligonucleotide as claimed in any one of claims 1, 2, or 3.
 6. Host transformed with a vector according to claim 5.
 7. Analytical kit for the detection of nucleotide sequences of the hepatitis C virus comprising one or more polynucleotide probe(s) according to claim 4.
 8. A process for preparing a polypeptide comprising: inserting a DNA molecule as claimed in any one of claims 1, 2, or 3, encoding the polypeptide into an expression vector; transforming cells with this expression vector comprising said inserted DNA molecule; and culturing said transformed cells.
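The relationship between an encoding oligonucleotide and its peptide, on which the claims rely, can be illustrated with a short translation sketch in Python. It is an illustration under stated assumptions, not a statement of the claimed method: the helper names `subsequence` and `translate` are hypothetical, the "n" positions are assumed to be 1-based and inclusive, and the reading frame starting at nucleotide 3 of SEQ ID NO:2 is inferred here from comparing SEQ ID NO:2 with SEQ ID NO:3 in the listing. Only the first 32 nucleotides of SEQ ID NO:2 are copied.

```python
# Illustrative sketch; helper names are assumptions, not from the specification.
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
# One-letter to three-letter codes, matching the notation of the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def subsequence(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq (assumed 1-based, inclusive)."""
    return seq[start - 1:end]


def translate(dna: str) -> str:
    """Translate complete codons of dna into three-letter residue codes."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)
    )


# First 32 nucleotides of SEQ ID NO:2, copied from the listing.
SEQ2_FIRST_32 = "TTCTGGAAGACGGCGTGAACTATGCAACAGGG"
coding = subsequence(SEQ2_FIRST_32, 3, 32)  # assumed frame: starts at nt 3
peptide = translate(coding)
```

Under these assumptions, `peptide` can be compared directly with the first ten residues of SEQ ID NO:3 as printed in the listing.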